Inspired by the impressive performance of recent face image editing methods, several studies have naturally been proposed to extend these methods to the face video editing task. One of the main challenges here is temporal consistency among edited frames, which remains unresolved. To this end, we propose a novel face video editing framework based on diffusion autoencoders that, for the first time as a face video editing model, successfully extracts decomposed features of identity and motion from a given video. This modeling allows us to edit the video consistently by simply manipulating the temporally invariant feature in the desired direction. Another unique strength of our model is that, because it is based on diffusion models, it satisfies both reconstruction and editing capabilities at the same time and, unlike existing GAN-based methods, is robust to corner cases in wild face videos (e.g., occluded faces).
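Federated learning (FL) is an active area of research. One of the most suitable areas for adopting FL is the medical domain, where patient privacy must be respected. Previous studies, however, have not fully considered who is most likely to use FL in the medical domain. It is not hospitals that are eager to adopt FL, but service providers that want to develop machine learning models with real patient records. Moreover, service providers want to maximize model performance at the lowest possible cost. In this work, we propose empirical benchmarks of FL methods that consider both performance and monetary cost on three real-world datasets: electronic health records, skin cancer images, and electrocardiogram datasets. We also propose Federated learning with Proximal regularization eXcept local Normalization (FEDPXN), which, using a simple combination of FEDPROX and FEDBN, outperforms all other FL algorithms while consuming only slightly more resources than the most efficient method.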
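As a rough illustration of the two ingredients named above, the following sketch pairs a FedProx-style proximal penalty, (mu / 2) * ||w_local - w_global||^2, with FedBN-style aggregation that leaves normalization parameters on each client. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the flat-list parameter format, the rule that a parameter belongs to a normalization layer when its name contains "bn", and the choice to apply the penalty only to shared parameters are all hypothetical.

```python
from typing import Dict, List

Params = Dict[str, List[float]]  # parameter name -> flat list of values


def is_norm_param(name: str) -> bool:
    # Hypothetical naming convention: normalization layers contain "bn".
    return "bn" in name


def proximal_penalty(local: Params, shared_global: Params, mu: float) -> float:
    """FedProx-style term (mu / 2) * ||w_local - w_global||^2.

    Assumption: only parameters present in the shared global model are
    penalized, so client-local normalization parameters are left free.
    """
    squared_distance = 0.0
    for name, global_values in shared_global.items():
        squared_distance += sum(
            (a - b) ** 2 for a, b in zip(local[name], global_values)
        )
    return 0.5 * mu * squared_distance


def aggregate_except_norm(clients: List[Params]) -> Params:
    """Average parameters across clients, skipping normalization parameters."""
    shared_names = [n for n in clients[0] if not is_norm_param(n)]
    averaged: Params = {}
    for name in shared_names:
        columns = zip(*(client[name] for client in clients))
        averaged[name] = [sum(col) / len(clients) for col in columns]
    return averaged


def apply_global(local: Params, shared_global: Params) -> Params:
    """Overwrite shared parameters; keep the client's own normalization ones."""
    updated = dict(local)
    updated.update(shared_global)
    return updated
```

In a training loop, each client would add `proximal_penalty(...)` to its task loss before every local update, and the server would call `aggregate_except_norm(...)` once per communication round.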
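Learning unbiased node representations under class-imbalanced graph data is challenging because of the interactions between adjacent nodes. Existing studies have in common that they compensate minor-class nodes "as a group" according to their overall quantity (ignoring node connections in the graph), which inevitably increases the false positive cases for major-class nodes. We hypothesize that the increase in these false positive cases is highly affected by the label distribution around each node, and we confirm this experimentally. To address this issue, we further propose Topology-Aware Margin (TAM), which reflects local topology in the learning objective. Our method compares the connectivity pattern of each node with its class-averaged counterpart and adaptively adjusts the margin accordingly. Our method consistently outperforms the baselines on various node classification benchmark datasets with representative GNN architectures.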
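The Mixup scheme suggests mixing a pair of samples to create an augmented training sample, and it has recently gained considerable attention for improving the generalizability of neural networks. A straightforward and widely used extension of Mixup is to combine it with regional-dropout-like methods: removing random patches from a sample and replacing them with features from another sample. Despite their simplicity and effectiveness, these methods are prone to creating harmful samples because of their randomness. To address this issue, "maximum saliency" strategies were recently proposed: they select only the most informative features to prevent this phenomenon. However, they now suffer from a lack of sample diversification, because they always deterministically select the regions with maximum saliency, which injects bias into the augmented data. In this paper, we present a novel yet simple Mixup variant that captures the best of both worlds. Our idea is two-fold. By stochastically sampling the features and "grafting" them onto another sample, our method effectively generates diverse yet meaningful samples. Its second ingredient is to produce the label of the grafted sample by mixing the labels in a saliency-calibrated fashion, which rectifies the supervision misguidance introduced by the random sampling procedure. Our experiments on the CIFAR, Tiny-ImageNet, and ImageNet datasets show that our scheme not only outperforms the current state-of-the-art augmentation strategies in terms of classification accuracy, but is also superior in coping with stress conditions such as data corruption and object occlusion.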
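Despite their ability to generalize with over-capacity networks, deep neural networks often learn to abuse spurious biases in the data instead of using the actual task-related information. Since such shortcuts are only effective within the collected dataset, the resulting biased model underperforms on real-world inputs or causes unintended social repercussions such as gender discrimination. To counteract the influence of bias, existing methods either exploit auxiliary information, which is rarely obtainable in practice, or sift for bias-free samples in the training data, hoping that enough clean samples exist. However, such presumptions about the data are not always guaranteed. In this paper, we propose Contrastive Debiasing via Generative Bias-transformation (CDVG), which is capable of operating in more general environments where existing methods break down because of unmet presumptions such as insufficient bias-free samples. Motivated by our observation that not only discriminative models, as previously known, but also generative models tend to focus on the bias, CDVG uses a translation model to transform the bias in a sample into another mode of bias while preserving task-relevant information. Through contrastive learning, we set the transformed biased views against one another to learn bias-invariant representations. Experimental results on synthetic and real-world datasets demonstrate that our framework outperforms the current state of the art and effectively prevents models from becoming biased even when bias-free samples are extremely scarce.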
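Graph-structured datasets usually have irregular graph sizes and connectivities, which makes recent data augmentation techniques such as Mixup difficult to use. To address this challenge, we present the first Mixup-like graph augmentation method at the graph level, called Graph Transplant, which mixes irregular graphs in data space. To be well defined on various scales of graphs, our method identifies a sub-structure as a mix unit that can preserve local information. Since Mixup-based methods without special consideration of the context are prone to generating noisy samples, our method explicitly employs node saliency information to select meaningful subgraphs and adaptively determine the labels. We extensively validate our method with diverse GNN architectures on multiple graph classification benchmark datasets from a wide range of graph domains of different sizes. Experimental results show the consistent superiority of our method over other basic data augmentation baselines. We also demonstrate that Graph Transplant improves performance in terms of robustness and model calibration.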
We propose a novel deep network architecture for lifelong learning, which we refer to as Dynamically Expandable Network (DEN), that can dynamically decide its network capacity as it trains on a sequence of tasks, to learn a compact, overlapping knowledge-sharing structure among tasks. DEN is efficiently trained in an online manner by performing selective retraining, dynamically expands network capacity upon the arrival of each task with only the necessary number of units, and effectively prevents semantic drift by splitting/duplicating units and timestamping them. We validate DEN on multiple public datasets under lifelong learning scenarios, on which it not only significantly outperforms existing lifelong learning methods for deep networks, but also achieves the same level of performance as the batch counterparts with substantially fewer parameters. Further, the obtained network, when fine-tuned on all tasks, achieved significantly better performance than the batch models, which shows that it can be used to estimate the optimal network structure even when all tasks are available in the first place.
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
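To make the attack success rate (ASR) reported above concrete, the sketch below stamps a fixed trigger patch onto test images and measures the fraction that a classifier then assigns to the attacker's target label. It is a minimal sketch under stated assumptions and does not reproduce the distillation procedure, NAIVEATTACK, or DOORPING: the bottom-right placement, the nested-list image format, the function names, and the choice to exclude inputs whose true label already equals the target are all hypothetical.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # grayscale image as rows of pixel values in [0, 1]


def stamp_trigger(image: Image, trigger: Image) -> Image:
    """Return a copy of `image` with `trigger` pasted in the bottom-right corner."""
    stamped = [row[:] for row in image]  # copy so the input stays untouched
    height, width = len(trigger), len(trigger[0])
    top = len(stamped) - height
    left = len(stamped[0]) - width
    for i in range(height):
        for j in range(width):
            stamped[top + i][left + j] = trigger[i][j]
    return stamped


def attack_success_rate(
    model: Callable[[Image], int],
    images: Sequence[Image],
    labels: Sequence[int],
    trigger: Image,
    target: int,
) -> float:
    """Fraction of triggered inputs that the model classifies as `target`.

    Assumption: inputs whose true label is already `target` are skipped,
    since they cannot show whether the trigger changed the prediction.
    """
    victims = [img for img, y in zip(images, labels) if y != target]
    if not victims:
        return 0.0
    hits = sum(model(stamp_trigger(img, trigger)) == target for img in victims)
    return hits / len(victims)
```

An ASR close to 1.0 means that nearly every triggered input is redirected to the target label, while clean-input accuracy has to be measured separately to judge how stealthy the backdoor is.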
Blind image quality assessment (BIQA) remains challenging due to the diversity of distortion and image content variation, which complicate the distortion patterns across different scales and aggravate the difficulty of the regression problem for BIQA. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies that help the regression model perform better. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT), to help the model learn complex distortion patterns and to better optimize the regression problem in a way that follows the easy-to-hard progression of human learning. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets. The experimental results indicate that PMT-IQA outperforms the comparison approaches and that both the MS and PMT modules improve the model's performance.
The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which holds back graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
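Selecting the features with the greatest information gain can be sketched as follows. This is an illustrative sketch, not the MGTAB pipeline: it assumes that each user property has already been discretized into categorical values, and the function names and the dictionary layout of the features are hypothetical. Information gain is computed in the usual way, as the label entropy minus the label entropy conditioned on the feature.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """H(labels) - H(labels | feature) for one discrete feature."""
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_features(
    features: Dict[str, Sequence[Hashable]],
    labels: Sequence[Hashable],
    k: int,
) -> List[str]:
    """Names of the k features with the highest information gain."""
    ranked = sorted(
        features,
        key=lambda name: information_gain(features[name], labels),
        reverse=True,
    )
    return ranked[:k]
```

Calling `top_k_features(features, labels, k=20)` would mirror the selection size mentioned in the abstract; continuous properties such as follower counts would need to be binned first.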